Apache Software Foundation

Verified @apache-software-foundation

20+ Apache top-level projects on TokRepo — Kafka, Spark, Flink, Airflow, Pulsar, Iceberg, Arrow, ECharts, APISIX. The data backbone modern AI runs on.

Shipped

6177

Total reads

Spotlight appearances

Latest release · 2026-04-18

⏱️

Automation Tasks

Apache NiFi — Visual Dataflow Automation & Integration Platform

Apache NiFi is a dataflow management system that lets you design, control, and monitor data pipelines through a drag-and-drop web interface. It is built for enterprise data routing, transformation, and system mediation, with provenance tracking and guaranteed delivery.

2026-04-17

294

🧠

Skills

Apache Pinot — Real-Time Distributed OLAP Datastore

Apache Pinot is a real-time distributed OLAP datastore designed to deliver low-latency analytical queries at high throughput. It powers user-facing analytics at companies like LinkedIn, Uber, and Stripe by ingesting data from Kafka and batch sources.

2026-04-18

313

Apache Hudi — Incremental Data Processing for Data Lakehouses

Apache Hudi (Hadoop Upserts Deletes and Incrementals) is an open-source data lakehouse platform that provides record-level insert, update, and delete capabilities on data lakes. It powers incremental pipelines, CDC ingestion, and near-real-time analytics on S3, GCS, and HDFS.

2026-04-17

315

Apache Beam — Unified Batch and Stream Data Processing

Apache Beam is a unified programming model for defining both batch and streaming data-parallel processing pipelines. Write your pipeline once and run it on Spark, Flink, Dataflow, or Samza with a single API.

2026-04-17

297

Apache DataFusion — Fast In-Process SQL Query Engine in Rust

An extensible query engine written in Rust that uses Apache Arrow as its in-memory format, enabling fast analytical SQL queries embeddable in any application.

2026-04-17

322

Apache Iceberg — Open Table Format for Huge Analytical Datasets

High-performance, engine-agnostic table format that brings ACID transactions, schema evolution, and time travel to Parquet data lakes.

2026-04-16

287

Apache SeaTunnel — High-Performance Data Integration Engine

Fast, distributed, cloud-native data integration tool for batch and streaming data synchronization across 100+ sources and sinks.

2026-04-16

303